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Abstract 

This paper describes four methods for estimating autocorrelation time and evaluates these methods 
with a test set of seven series. Fitting an autoregressive process appears to be the most accurate method 
of the four. An R package is provided for extending the comparison to more methods and test series. 

1 Introduction 
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^""^ Autocorrelation time measures the convergence rate of the sample mean of a function of a stationary, geo- 

metrically ergodic Markov chain with bounded variance. Let {A^}™, be n values of a scalar function of the 
C/3 states of such a Markov chain, where EXi = /i and var X{ = a 2 . Let X n be the sample mean of (X%, . . . , X n ). 

Then, the autocorrelation time is the value of r such that: 

£ /.fc^iV(0,l) (1) 

Conceptually, the autocorrelation time is the number of Markov chain transitions equivalent to a single 
y—\ independent draw from the distribution of {A^}. It can be a useful summary of the efficiency of a Markov 

chain Monte Carlo (MCMC) sampler. Since there is rarely a closed-form expression for the autocorrelation 
j—i time, many methods for estimating it from the data itself have been developed. This document compares 

four of them: computing batch means, fitting a linear regression to the log spectrum, summing an initial 
sequence of the sample autocorrelation function, and modeling as an autoregressive process. 



For a broader context, two classic overviews of uncertainty in MCMC estimation are Geyer (1992) and 



Neal ( 1993| §6.3). The most popular alternative to autocorrelation time, potential scale reduction, is dis- 



cussed by German and Rubin (1992) 



2 Methods for computing the autocorrelation time 
2.1 Batch means 

Equation [l] is approximately true for any subsequence, so we can divide the {A^} into batches of size m and 
compute the sample mean of each batch. As m goes to infinity, the batch means have asymptotic variance 



1 



a 2 ■ t /m. So, if s 2 is the sample variance of {Xi} and s 2 ^ is the sample variance of the batch means, we can 



estimate the autocorrelation time with: 



(2) 



For this to be a consistent estimator of r, the batch size and the number of batches must go to infinity. To 
ensure this, I use n 1 / 3 batches of size n 2 / 3 . Fishman (1978 §5.9) discusses batch means in great detail; Neal 



(1993 p. 104) and Geyer (1992 §3.2) discuss batch means in the context of MCMC. 



2.2 Fitting a linear regression to the log spectrum 

In the spectrum fit method ( Heidelberger and Welch 1981| §2.4), a linear regression is fit to the lower fre- 
quencies of the log-spectrum of the chain and used to compute an estimate of the spectrum at frequency zero, 
which we will denote by Iq. Let s 2 be the sample variance of the {A^}. We can estimate the autocorrelation 
time with: 

r=k (3) 

This is implemented by the spectrumO function in R's CODA package (Plummer et al. 2006[ ). First and 
second order polynomial fits are practically indistinguishable on the test series described in section |3j so this 
paper only explicitly includes the results for a first order fit. 



2.3 Initial sequence estimators 

One formula for the autocorrelation time is (Straatsma et al. 1986| p. 91): 



(4) 
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where pk is the autocorrelation function (ACF) of the series at lag k. For small lags, pt can be estimated 
with: 
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Pk 
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(Xi — X n )(Xi + k — x n ) 



(5) 



But, this is not defined for lags greater than n — 1, and the partial sum pi + ■ ■ ■ + p n -\ does not have a 
variance that goes to zero as n goes to infinity, so substituting all values from equation [5] into equation [4] is 
not an acceptable way to estimate r. 

Geyer (1992) shows that, for reversible Markov chains, sums of pairs of consecutive ACF values, Pi+Pi+i, 



are always positive, so one can obtain a consistent estimator, the initial positive sequence (IPS) estimator, 
by truncating the sum when the sum of adjacent sample ACF values is negative. 

Further, the sequence of sums of pairs is always decreasing, so one can smooth the initial positive sequence 
when a sum of two adjacent sample ACF values is larger than the sum of the previous pair; the resulting 
estimator is the initial monotone sequence (IMS) estimator. Finally, the sequence of sums of pairs is also 
convex. Smoothing the sum to account for this results in the initial convex sequence (ICS) estimator. 

I was unable to distinguish the behavior of the IPS, IMS, and ICS estimators, so the comparisons of this 
paper only directly include the ICS estimator. 
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2.4 AR process 



Another way to estimate the autocorrelation time, based on CODA's spectrumO . ar (Plummer et al. 2006 1, 



is to model the series as an AR(p) process, with p chosen by AIC. We suppose a model of the form: 

X t = n + niXf-i H h Tr p Xt-p + at, at ~ N(0, al) 



(6) 



Let pi.p be the vector (pi, . . . , p p ). We can obtain an estimate the AR coefficients, tti :p , with the Yule-Walker 



method fWei| |2006| pp. 136-138). Combining equations 7.1.5 and 12.2.8b of |Wei| ( |2006[ pp. 137, 274-275), 
we can estimate the autocorrelation time with: 



1 " Pl-.pKl-.p 
(l-l?*l:p) 2 



(7) 



Because the Yule-Walker method estimates the asymptotic variance of Tti- P , Monte Carlo simulation can be 
used to generate a confidence interval for the autocorrelation time. For more information on this method, 



sec Fishman (1978 §5.10) and Thompson (2010). 



3 Seven test series 

I evaluate the methods of section [2] with seven series. The first 10,000 elements of each are plotted in figure[T] 

• AR(1) is an AR(1) process with autocorrelation equal to 0.98 and uncorrelated errors. Using equa- 
tion [7j its autocorrelation time is 99. It is, in one sense, the simplest possible series with an autocor- 
relation time other than one. 

• AR(2) is the AR(2) process with poles at —1 + O.li and —1 — O.li: 

Z t = 1-98 Z t -i- 0.99 Z t - 2 + <H, Ot~JV(0,l) (8) 

Despite oscillating with a period of about 60, the terms in its autocorrelation function nearly cancel 
each other out, so its autocorrelation time is approximately two. This series is included to identify 
methods that cannot identify cancellation in long-range dependence. 

• AR(1)-ARCH(1) is an AR(1) process with autocorrelation 0.98 and autocorrelated errors: 

Z t = 0.98 Z t -! + a t , a t ~ N (0, 0.01 + 0.99 al_ x ) (9) 

Its autocorrelation time is also 99. This series could confound methods that assume constant step 
variance. 

• Met-Gauss is a sequence of states generated by a Metropolis sampler with Gaussian proposals sam- 
pling from a Gaussian probability distribution. It has an autocorrelation time of eight. It is a simple 
example of a state sequence from an MCMC sampler. 
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Figure 1: The first 10,000 elements of seven series used as a test set. See section [3] for more information. 
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Figure 2: Autocorrelation times computed by four methods on seven series for a range of subsequence 
lengths. Each plot symbol represents a single estimate. The dashed line indicates the true autocorrelation 
time. Breaks in the lines for spectrum fit indicate instances where the method failed to produce an estimate. 
See section |4] for discussion. 



ARMS-Bimodal is also a sequence of states from an MCMC simulation, ARMS (Gilks et al. 1995) 
applied to a mixture of two Gaussians. The sampler is badly tuned, so it rarely accepts a proposal 
when near the upper mode. It has an autocorrelation time of approximately 200. 

Stepout-Log-Var is a third MCMC sequence, the log-variance parameter of slice sampling with 



stepping out (Neal 2003| §4) sampling from a ten-parameter multilevel model (Gelman et al. 2004 



pp. 138-145). Its autocorrelation time is approximately 200. 

Stepout-Var was generated by exponentiating the states of Stepout-Log-Var. Because its sample 
mean is dominated by large observations, its autocorrelation time, approximately 100, is not the same 
as that of Stepout-Log-Var. Like ARMS-Bimodal and Stepout-Log-Var, it is a challenging, real-world 
example of a sequence whose autocorrelation time might be unknown. 
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Figure 3: The autocorrelation function of the AR(2) series. See section [I] for discussion. 

4 Comparison of methods 

For each series, I compare the true autocorrelation time to the autocorrelation time estimated by each 
method for subsequences ranging in length from 10 to 500,000. The results are plotted in figure [5] All four 
methods converge to the true value on all seven chains, with the exception of ICS on the AR(2) process. In 
general, the AR process method converges to the true autocorrelation time fastest, followed by ICS, batch 
means, and finally spectrum fit, which does not even produce an estimate in all cases. All four methods tend 
to underestimate autocorrelation times for all short sequences except those from the AR(2) process, which 
appears in small samples to have a longer autocorrelation time than it actually does. 

ICS is inconsistent on the AR(2) process because the autocorrelation function of the AR(2) process is 
significant at lags larger than its first zero-crossing (see figure [3]). ICS and the other initial sequence estimators 
are not necessarily consistent when the underlying chain is not reversible; an AR(2) process is not, when 
considered a two state system. These estimators stop summing the sample autocorrelation function the first 
time its values fall below zero, so these estimators never see that values at large lags cancel values at small 
lags. 

The AR process method can also generate approximate confidence intervals for the autocorrelation time. 
95% confidence intervals generated by the AR process method are shown in figure [4| The intervals are 
somewhat reasonable for subsequences longer than the true autocorrelation time. An exception is on AR(2), 
where the first few hundred elements do not represent the entire series but appear as though they might. 

5 Discussion 

The AR process method is more accurate than the three other methods and is the only method that, as 
implemented, generates confidence intervals, so I usually prefer it to the other three. The batch means 
estimate is faster to compute and almost as accurate, so it may be useful when speed is important and 
confidence intervals are not necessary. 
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Figure 4: AR process-generated point estimates and 95% confidence intervals for the autocorrelation times 
of seven series for a range of subsequence lengths. The dashed line indicates the true autocorrelation time. 
The confidence intervals are somewhat reasonable. See section [4] for discussion. 
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The test set only contains seven series and reflects the type of data I encounter in my own work. To 
extend the comparison to other methods and series, one can use ACTCompare, an R package available at 
http://www.utstat.toronto.edu/mthompson 
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